ByteDance Releases Open-Source Multi-modal Model BAGEL: From Image Generation to World Modeling
ByteDance recently released BAGEL (Big Advanced Generalized Embodied Learner), its latest open-source multi-modal foundation model with 7 billion active parameters, marking a new stage for multi-modal AI models. BAGEL performs strongly on key tasks such as image understanding, generation, and editing, and surpasses mainstream open-source vision-language models (VLMs) such as Qwen2.5-VL and InternVL-2.5 on multiple standard benchmarks.